Extracting Generic Statements for the Semantic Web
Authors
Abstract
Much of the natural language text found on the web contains various kinds of common sense knowledge, and such information is potentially an important supplement to more formal approaches to building knowledge bases for Semantic Web applications. Common sense knowledge is often expressed in the form of generic statements such as “Elephants are mammals.” A generic statement is a sentence that talks about kinds rather than individuals. In this thesis, we develop methods for automatically extracting generic statements from unrestricted natural language text and mapping them into an appropriate logical formalism for the Semantic Web. The extraction process uses cascaded transduction rules to identify generic sentences and extract relations from them. An NLP pipeline and a suite of XML tools designed for generic manipulation of XML were used to carry out this series of tasks. The Wikipedia XML corpus was adopted for development, as a rich source of generic statements, and existing annotations of the ACE 2005 corpus were used to test the identification of generic terms. To identify generic terms, we encode a set of morpho-syntactic features as definite transduction rules and apply the rules to the noun groups resulting from chunking. Next, relations are extracted with the identified terms as arguments; the semantic interpretation underlying this relation extraction can be described as semantic chunking. Finally, we show how the extracted relations can be converted to RDF(S) statements as a knowledge representation for the Semantic Web.
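The pipeline described above — rule-based identification of generic noun phrases followed by mapping to RDF(S) — can be illustrated with a minimal sketch. The single bare-plural rule, the copular pattern, and the `http://example.org/kb#` namespace below are illustrative assumptions, not the thesis's actual transduction rules or vocabulary:

```python
import re

# Illustrative namespace; the thesis's actual vocabulary may differ.
EX = "http://example.org/kb#"

DETERMINERS = {"the", "a", "an", "this", "that", "these", "those", "some"}

def is_generic_np(np: str) -> bool:
    """Crude morpho-syntactic test for a generic noun phrase:
    a bare plural with no determiner (e.g. 'elephants')."""
    tokens = np.lower().split()
    return tokens[0] not in DETERMINERS and tokens[-1].endswith("s")

def extract_generic(sentence: str):
    """Match copular generics of the form 'Xs are Ys.' and, if both
    noun groups pass the genericity test, emit an RDFS triple."""
    m = re.match(r"^(\w+)\s+are\s+(\w+)\.?$", sentence.strip())
    if not m:
        return None
    subj, obj = m.group(1), m.group(2)
    if not (is_generic_np(subj) and is_generic_np(obj)):
        return None
    # Map plural kind terms to singular class names (naive stemming).
    cls = lambda w: w.rstrip("s").capitalize()
    return f"<{EX}{cls(subj)}> rdfs:subClassOf <{EX}{cls(obj)}> ."

print(extract_generic("Elephants are mammals."))
# -> <http://example.org/kb#Elephant> rdfs:subClassOf <http://example.org/kb#Mammal> .
```

A sentence with a definite determiner, such as “The elephants are tired.”, fails the pattern and is rejected — mirroring how the thesis's rules separate statements about kinds from statements about individuals, though the real system uses a far richer feature set over chunked noun groups.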
Similar resources
Presenting a method for extracting structured domain-dependent information from Farsi Web pages
Extracting structured information about entities from web texts is an important task in web mining, natural language processing, and information extraction. Information extraction is useful in many applications including search engines, question-answering systems, recommender systems, machine translation, etc. An information extraction system aims to identify the entities from the text and extr...
Extracting Common Sense Knowledge from Wikipedia
Much of the natural language text found on the web contains various kinds of generic or “common sense” knowledge, and this information has long been recognized by artificial intelligence as an important supplement to more formal approaches to building Semantic Web knowledge bases. Consequently, we are exploring the possibility of automatically identifying “common sense” statements from unrestri...
IMPROVE THE RECOMMENDER SYSTEM USING SEMANTIC WEB
To buy necessities such as books, movies, CDs, and music, people have always relied on others’ oral and written advice and recommendations and factored them into their decisions. Nowadays, with the progress of technology and the growth of e-business websites, recommender systems have ushered in a new age of digital life. The most important objectives of these systems include a...
AHP Techniques for Trust Evaluation in Semantic Web
The increasing reliance on information gathered from the web and other internet technologies raises the issue of trust. One major difficulty with the development of the Semantic Web is that, by its very nature, it is a large, uncensored system to which anyone may contribute. This raises the question of how much credence to give each resource. Each user knows the trustworthiness of ...
A procedure for Web Service Selection Using WS-Policy Semantic Matching
In general, policy-based approaches play an important role in the management of web services, for instance in the selection of semantic web services and of quality of service (QoS) in particular. The present research illustrates a procedure for selecting among functionally similar web services based on WS-Policy semantic matching. In this study, the procedure of WS-Policy publi...